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Abstract 

Many  methods  exist  for  fitting  ellipses  and  other  second-order  curves  to  sets  of  points  on  the  plane. 
Different  methods  use  different  measures  for  the  goodness  of  fit  of  a  given  curve  to  a  set  of  points.  The 
method  most  frequently  used,  minimization  based  on  the  general  quadratic  form,  has  serious  deficiencies. 
Two  alternative  methods  are  proposed:  the  first,  based  on  an  error  measure  divided  by  its  average  gradient, 
uses  an  eigenvalue  solution;  the  second  is  based  on  an  error  measure  divided  by  individual  gradients,  and 
requires  hill  climbing  for  its  solution. 

As  a  corollary,  a  new  method  for  fitting  straight  lines  to  data  points  on  the  plane  is  presented. 
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Introduction 

This  p>i|K  i  vii  -aisscs  the  following  problem;  Given  some  set  of  data  points  on  die  plane,  how  should  we  fit 
an  ellipse  to  these  points?  In  more  precise  terms,  let  curves  be  represented  by  some  equation  C(jr.;)=0.  We 
restrict  G(x,y)  to  be  a  polynomial  in  x  and  y  of  degree  not  greater  than  2.  The  curves  generated  by  such  a 
function  are  the  conic  sections:  ellipses,  hyperbolas,  and  parabolas.  In  the  special  case  where  is  of 

degree  1,  the  curve  represented  is  a  straight  line.  Now,  given  a  set  of  data  pairs  {(jr^,  <=  1 . n),  what  is  the 

function  G(x,y)  such  that  the  curve  described  by  the  equation  best  describes  or  tits  the  data?  The  answer  to 
this  question  depends  upon  how  we  define  "best." 

The  primary  motivation  for  studying  this  problem  is  to  deal  with  systems  that  use  light  stripes  to  measure 
depth  information  [Agin  76]  [Shirai]  [Popplestonc].  When  a  plane  of  light  cuts  a  cylindrical  surface  it 
generates  a  half  ellipse  in  the  plane  of  the  illumination.  When  this  ellipse  is  viewed  in  perspective  it  gives  rise 
to  another  partial  ellipse  in  the  image  plane.  The  incomplete  nature  of  this  curve  segment  makes  it  difficult  to 
measure  its  intrinsic  shape. 

A  similar  problem  often  arises  in  scene  analysis  [Render]  [Tsuji].  A  circle  viewed  in  perspective  generates 
an  ellipse  on  the  image  plane.  If  some  scene-understanding  procedure  can  identify  the  points  that  lie  on  the 
perimeter  of  the  ellipse,  these  points  may  be  used  as  the  data  points  in  a  curve-fitting  process  to  identifying 
the  dimensions  of  the  ellipse.  The  relative  lengths  of  the  major  and  minor  axes  and  the  orientation  of  these 
axes  will  then  be  sufficient  to  determine  the  plane  of  the  ellipse  relative  to  the  camera. 

Fitting  ellipses  and  other  second-order  curves  to  data  points  can  be  useful  in  interpreting  physical  or 
statistical  experiments.  For  example,  particles  in  bubble-chamber  photographs  may  follow  elliptical  paths, 
the  dimensions  of  which  must  be  inferred. 

It  is  easy  to  sec  how  a  fitter  of  ellipses  would  be  useful  in  an  interactive  graphics  or  a  computer-aided 
drawing  package;  i.e.,  the  user  could  indicate  a  rough  approximation  to  the  ellipse  or  circle  he  wants,  and  the 
system  could  infer  the  best-fitting  approximation.  This  kind  of  capability  is  currently  handled  by  fitting  with 
splines  [Smith]  [Baudelaire], 

It  is  important  to  distinguish  among  the  extraction  of  points  that  may  represent  the  boundary  of  an  ellipse; 
the  segmentation  of  collections  of  points  into  distinct  curves;  and  the  Jilting  of  these  points  once  they  have 
been  extracted.  This  paper  docs  not  purport  to  describe  how  to  determine  which  points  do  or  do  not  belong 
to  any  ellipse  or  ellipse  segment.  Curve  fitting  can  be  of  use  in  segmentation  and  extraction  to  evaluate  the 
reasonableness  of  a  given  hypothesis;  however  this  discussion  is  limited  to  methods  for  determining  the 
equation  of  the  curve  that  best  fits  a  given  set  of  data  points. 
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Representing  Second-Order  Curves 

An  ellipse  in  "standard  position",  such  as  the  one  in  Figure  1,  may  be  represented  by  die  equation 

a2  b 2 


Figure  1:  An  Ellipse  in  Standard  Position 

Such  an  ellipse  has  its  center  at  the  origin  of  coordinates  and  its  principal  axes  parallel  to  the  coordinate  axes. 
If  parameter  a  is  greater  than  parameter  b,  then  a  represents  the  length  of  the  semi-major  axis  and  b  represents 
the  length  of  the  semi-minor  axis.  The  eccentricity  (e)  of  the  ellipse  is  defined  by  the  formula 

where  e  must  be  positive,  and  between  zero  and  1.  If  a—b.  then  equation  1  represents  a  circle,  and  e  is  zero. 
If  a<b  then  b  represents  the  semi-major  axis  and  a  the  semi-minor,  and  e  is  defined  as 

A  shift  of  coordinates  allows  us  to  represent  an  ellipse  centered  on  a  point  other  than  the  origin,  say  (h,k), 
as  in  Figure  2.  If  we  let 

x  ’  =  x  -  h  (2) 

and  y '  =  y  -  k 

then  the  equation  of  the  ellipse  of  Figure  2  is 
x  '2  v  ’2 

7  +  7  =  1' 


(x  -  hf  ±  (y  -  k)1 
u2  b1 


(4) 


Figure  2:  An  Ellipse  off  the  Origin  of  Coordinates 


A  rotation  of  the  ellipse,  as  in  Figure  3,  can  be  accounted  for  by  tire  transformation 


x"  =  -x  ’cos  9  +  y  ’sin  9 
and  y"  =  -  x  ’sin  9  +  y  ’cos  9. 

These  transformations  can  be  substituted  directly  into  the  equation  for  an  ellipse,  but  we  prefer  the  implicit 


form: 


where 

and 


u2 


=  l 


x"  =  (x-h)  cos  9  +  (v—  k)  sin  9 
y"  =  —(.x-  h)  sin  9  +  (y—  k) cos  9  . 


(5) 
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Equation  5  can  represent  any  ellipse  in  any  orientation.  A  total  of  five  parameters  arc  involved:  a  and  b 
represent  the  dimensions  of  the  ellipse,  h  and  k  represent  its  center,  and  0  represents  its  rotation. 

The  equation  of  a  hyperbola  in  standard  position  is  similar  to  that  of  an  ellipse,  but  with  a  sign  change: 


Figure  4:  A  Hyperbola  in  Standard  Position 


A  hyperbola  is  shown  in  Figure  4.  Its  eccentricity  is  given  by 

The  center  of  the  hyperbola  can  be  moved  and  its  axes  rotated  by  transforms  similar  to  those  we  used  for 
ellipses.  We  can  represent  ellipses  and  hyperbolas  by  the  same  equation  or  set  of  equations  if  we  let 


eccentricity  into  the  equation.  Any  central  conic  (ellipse  or  hyperbola)  can  be  represented  as: 
"2  "l 

—  + - L -  =  1 

c?  <?{  i-e2) 

where  x"  -  (x- h)  cos  d  +  (y-  k)  sin  6 

and  y"  =  —  (*—  h)  sin  6  +  (y~k)  cos  0 


(6) 


It  should  be  noted  here  that  an  ellipse  can  also  be  represented  parametrically.  For  an  ellipse  in  standard 
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orientation,  points  on  its  perimeter  arc  given  by 

x  =  h  +  n  cos  <j>  (7) 

y  —  k-\-  b  sin  <£ , 

where  <p  varies  between  0  and  lit.  The  rotation  6  of  the  ellipse  can  be  taken  care  of  by  rewriting  equation  7 
as  follows: 

x  =  /;  +  a  cos  4>  cos  9  -  b  sin  <f>  sin  9 
y  -  k  +  a  cos  <p  sin  9  +  b  sin  <j>  cos  9 

A  hyperbola  may  also  be  represented  parametrically,  using  hyperbolic  functions.  Points  on  a  hyperbola  in 

standard  orientation  with  its  center  at  (h,k)  are  given  by 

x  -  h±  a  cosh  f 
y  -  k±  b  sinh  f  , 

The  value  of  £  may  vary  from  zero  to  an  arbitrary  upper  limit  The  various  permutations  of  the  ±  signs  give 
rise  to  the  four  branches  of  the  hyperbola. 

A  parabola  is  actually  a  conic  section  with  eccentricity  1,  but  if  we  try  to  represent  it  in  the  form  of 
Equation  6  a  division  by  zero  results.  It  is  better  to  represent  the  parabola  by  the  equation 
y  -  a  x2 

A  shill,  of  origin  and  a  rotation  give  the  form: 

f  =  a  x"2  (8) 

where  x"  -  ( x - h)  cos  6  +  (y-  k)  sin  6 
and  /'  =  -(.<-/i)sinfl  +  (y~  k)  cos  0 

Given  any  parameters  of  size,  position,  and  orientation.  Equation  6  or  Equation  8  can  be  rewritten  in  the 
form 

Gix.y)  =  A  x2  +  B  xy  +  Cy  +  D  x  +  E  y  +  F  =  0  (9) 

It  may  be  shown  that  all  conics  may  bv  represented  in  the  form  of  Equation  9. 

Purcell  (Purcell,  p.  130]  shows  that  Equation  9  represents  a  hyperbola  if  the  indicator,  B2  -  4  .(  C  is 
positive,  a  parabola  if  it  is  zero,  or  an  ellipse  if  it  is  negative. 

Furthermore,  the  parameters  of  Equations  6  or  8  may  be  recovered  by  the  following  procedure:  Apply  a 

rotation  6  in  which  9  -  45  degrees  if  A  =  C  and 

,  „  B 
tan  2  6  = - 

A  -  C 

if  .1  *  C.  lliis  transforms  Equation  9  into  an  equivalent  form  in  which  B  (the  coefficient  of  the  xy  term)  is 
zero.  It  is  then  a  straightforward  matter  to  extract  the  other  four  parameters. 
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Minimization  and  Approximation  Theory 

Approximation  theory  is  a  mathematical  discipline  that  addresses  curve  fitting  [Rivlin].  Usually,  a  set  of  n 

data  points  are  specified  as  pairs  of  the  form  {  (x-.y/  i=  1 . n  },  where  x  is  regarded  as  an  independent 

variable  and  yi  represents  values  measured  at  n  particular  values  of  x.  Uct  the  symbol  v  denote  the  set  of  given 

data  pairs.  Let  Fbc  the  set  of  all  functions  defined  on  {x^  i  =  l . /?}.  V  is  thus  an  n-dimcnsional  linear 

space,  and  v€  V. 

Admissible  solutions  to  curve  fitting  problems  are  usually  represented  in  the  form  y= f(x).  The  set  of  all 
admissible  solutions  constitutes  a  subspacc  W  of  V,  whose  dimensionality  corresponds  to  the  number 
parameters  used  to  characterize  f.  For  example,  the  set  of  all  quadratic  functions  of  one  variable  constitutes  a 
space  of  dimensionality  three.  Given  some  w  €  W  we  need  a  measure  of  the  difference  between  w  and  v, 
which  we  denote  as  [tv —  v),  the  norm  of  w—  v.  The  norm  may  be  defined  in  the  Euclidean  manner  as  the 
square  root  of  the  sum  of  the  squares  of  w-  v,  where  summation  is  over  all  values  of  x  for  which  both  w(x) 
and  v<x)  are  defined.  Another  norm  in  frequent  use  is  the  maximum  of  all  elements  of  w—  v,  again  over  all 
points  where  both  functions  arc  defined. 

A  central  theorem  of  approximation  theory  states  that  there  exists  some  tv*  such  that 
|tv*—  v|  <  | tv —  v| 

for  all  w€  IK  When  we  use  the  Euclidean  norm,  we  say  the  minimizing  w*  is  the  best  approximation  in  the 
least-squares  sense.  If  the  norm  is  the  maximum  of  all  elements  of  tv-  v,  the  minimizing  tv*  is  referred  to  as 
the  best  uniform  approximation. 

The  paradigm  outlined  above  can  be  generalized  to  several  dimensions.  For  example,  given  triples  of  the 
form  {  x,  yt,  r,  i=  1, ....  n  }  and  a  space  of  functions  n(x,y)  we  may  find  tv*  that  minimizes  (in  the  appropriate 
sense)  the  difference  between  wfx,^)  and  z..  But  however  many  dimensions  there  are,  the  basic  assumption 
remains:  that  tv  is  a  single  valued  function  of  one  or  more  independent  variables. 

It  is  difficult  to  represent  an  ellipse  as  a  single-valued  function.  Therefore,  the  "difference"  between  a  data 
point  and  an  ellipse  is  not  uniquely  and  unambiguously  defined.  Intuitively,  the  difference  should  represent 
the  perpendicular  distance  from  the  point  to  the  curve.  If  an  ellipse  were  represented  in  the  form  y~f(x), 
then  /  would  be  multivalued  over  some  range  of  x,  and  have  no  value  elsewhere.  Usually  ellipses  arc 
represented  implicitly  by  equations  of  the  form  g(x.y)=0.  We  might  choose  a  norm  that  estimates  the 
magnitude  of  g  itself,  (i.c.,  it  measures  the  difference  between  g  and  zero.)  and  search  for  a  g*  that  minimizes 
that  norm.  Rut  the  "classical"  techniques  of  approximation  theory  arc  no  longer  applicable,  so  we  must 
develop  other  techniques. 
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Choosing  an  Error  Function 

1  ’  i.tyi  f  '!  fUiiv  *<>r  cflmse  fitting  is  as  follows:  First,  choose  a  method  of  estimating  the  "error"  of  a 
point  with  icspect  to  any  gi\cu  second-order  curve;  second,  choose  a  method  of  calculating  an  aggregate  error 
from  all  the  individual  errors;  third,  systematically  search  for  die  ellipse  that  minimizes  the  aggregate  error. 
The  choice  of  an  error  measure  and  an  aggregating  rule  affects  not  only  the  solution,  but  also  die 
computational  effort  needed  to  obtain  the  solution. 

It  should  be  noted  that  any  five  arbitrary  points  on  die  plane  arc  sufficient  to  specify  a  second-order  curve. 
As  long  as  no  three  of  the  five  points  are  coplanar,  there  exists  a  unique  second-order  curve  that  passes  exaedy 
through  each  of  the  five  points.  An  algebraic  procedure  exists  for  finding  this  curve  [Bolles].  .  More 
sophisticated  methods  become  necessary  only  when  there  are  more  than  five  data  points  to  be  fit. 

If  all  the  data  points  lie  on,  or  very  close  to.  a  madiematically  perfect  curve,  then  almost  any  mediod  for 
fitting  ellipses  will  give  acceptable  results.  In  practice,  problems  usually  arise  when  the  a  become  noisy 
and  dispersed.  Very  eccentric  ellipses  arc  harder  to  fit  than  nearly  circular  ones.  Cases  re  only  a  portion 
of  the  complete  curve  is  represented  by  data  points  generally  create  problems:  th  complete  the 
perimeter  the  greater  the  difficulty  of  estimating  the  curve  to  represent  it. 

For  the  rest  of  this  discussion,  we  will  consider  only  a  Euclidean  norm.  In  other  words,  we  are  restricting 
our  attention  to  least-squares  methods.  This  reflects  a  desire  to  let  the  solution  represent  an  "average"  of  all 
the  data,  rather  than  being  influenced  primarily  by  the  outlying  points,  as  would  be  die  case  if  we  used  a 
uniform  norm. 

Using  the  General  Quadratic  Form 

One  possible  choice  of  an  error  function  is  die  general  quadratic  form  of  a  second-order  curve  as  given  in 
Equation  9.  Wc  must  avoid  die  trivial  solution  A  =  B  =  C  =  D  =  F  =  F  =  0.  so  we  arbitrarily  assign 
F  =  1.  This  gives 

<j(x>)  =  A  jr  +  B  xy  +  Cy2  4-  D  x  +  F.  y  +  I  =  0  .  (10) 

Given  a  data  point  (x-.y).  we  let  die  pointwisc  error  be  given  by 

=  A  V  +  B  Vi  +  CK  +  D  -r. +  E-v\  + 1  • 

The  aggregate  error  is  given  by 
Z  =  2  £,2 

=  2  (  .1  x  ~  +  ft  x  y  -i-  C  v."  -f  D  x  +  F  v  +  1  )2  (11) 

Obtaining  the  partial  derivatives  of  Equation  !i  with  respect  to  A.  ft.  C.  D.  and  and  setting  these  to  zero. 
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we  obtain  the  following  system  of  equations: 

A  Zx4  +  B  Zx2y  +  C  Zx2y2  +  D  Zx2  +  E  2j c2y  +  Zx2  =  0 

A  2.v3>'  +  B  2  .O'3  +  C  Zxy2  +  D  2 a';'  +  E  2a>"  +  Zxy  =  0 

A  Zx2}'2  +  B  Zxy2  +  C  Zy4  +  D  Zxr  +  E  Zy2  +  Zy2  =  0  (12) 

A  2a3  +  B  2 x2y  -I-  C  Zxy2  +  D  Zx2  +  E  Zxy  +  2  x  =0 

A  Zxy  +  B  Zxr  +  C  Zy3  +  D  Zxy  +  E  2 y2  +  Zy  =0 

The  solution  to  these  equations  represents  the  the  ellipse  that  minimizes  the  error  function  given  in  Equation 

11. 


Figure  5:  Fit  Obtained  by  Minimizing  Equation  11 


Figure  6:  Fit  Obtained  by  Minimizing  Equation  11 


Figure  5  shows  a  set  of  computer-generated  data  points  and  the  curve  generated  by  this  method  to  lit  it. 
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The  method  appears  to  work  adequately  in  this  c  Out  Figure  fa  shows  another  case,  where  die  minimizing 
ellinsc  clearly  misses  the  data  points  near  the  origin  of  coordinates.  What  we  are  seeing  is  the  result  of  a  poor 
choice  of  error  function.  When  we  went  from  the  ellipse  representation  of  Equation  9  to  dial  of  Equation  10 
by  fixing  F  to  be  1,  we  allowed  the  representation  to  become  degenerate;  we  lost  the  ability  to  represent  an 
ellipse  that  passes  through  die  origin.  An  ellipse  as  represented  by  Equation  10  that  passes  close  to  the  origin 
must  have  large  coefficients  A.  B,  C,  D,  and  E\  hence  die  error  measure  Z  of  Equation  11  will  be  large. 
Therefore,  minimizing  Z  implies  keeping  the  curve  away  from  the  origin. 

A  requirement  of  a  useful  curve  fitting  method  is  that  it  should  be  independent  of  scaling,  translation,  or 
rotation  of  the  data  points.  That  is.  the  choice  of  a  coordinate  system  should  not  affect  the  solution  curve; 
except,  of  course,  that  the  solution  curve  should  be  scaled,  moved,  or  rotated  along  with  die  data  points. 

The  Average  Gradient  Constraint 

Ideally,  the  error  function  we  choose  to  minimize  should  be  related  to  the  distance  from  a  point  to  the 
curve.  Suppose  we  were  to  choose  some  primitive  error  measure  such  as  the  G(x.y)  given  in  Equation  9.  G  is 
zero  along  the  curve,  and  its  magnitude  increases  when  we  measure  G  at  points  farther  and  fardier  from  the 
curve.  For  a  point  in  a  small  neighborhood  die  curve.  G  is  proportional  to  die  perpendicular  distance  from 
the  point  to  the  curve.  The  constant  of  proportionality  is  the  reciprocal  of  die  magnitude  of  dn  gradient  of  G.  v 


We  will  choose  a  constraint  on  the  coefficients  of  Equation  9  such  that  the  average  gradient  is  unity.  Then 
the  resulting  error  function  will  be  directly  related  to  the  distances  from  points  to  curves. 

A  shift  in  notauon  will  make  the  following  mathematics  easier.  Define  the  vectors  X  and  V  to  be 


'  .x2  • 

A 

/ 

B 

and  V  = 

C 

X 

D 

y 

E 

i  J 

.  F  . 

Then  we  may  rewrite  Equation  9  as 
t(xj)  =  VT  X  =  XT  V  . 

Using  the  Euclidean  norm,  our  aggregate  error  Z  is  given  by 

-  =  2  £,2  =  Z  G2  =  21  VTXXTV)  =  VTZ(XXT)V  =  VT  1>  V  .  (13) 

P  =  2  X  Xr  is  a  matrix  of  sums  of  powers  of  v  and  \ ,  whose  first  five  rows  and  columns  are.  in  fact,  die 
coefficients  of  .1.  II,  C.  D,  and  F.  in  Equation  12  and  whose  last  column  provides  die  constant  terms. 
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The  magnitude  of  die  gradient  of  (7,  |v  <>|  may  be  determined  from  die  partial  derivatives  of  6  with  respect 
to  x  and  y. 

^syT»sVTx 

dx  dx 

=  VTX 
dy  dy  y 

where 


2x 

0 

y 

X 

0 

X  = 

v 

2 y 

1 

J 

0 

0 

1 

.  0  . 

.  0  . 

(VQ2  =  (— )2  +  (— )2  =  VT(X  X  T  +  X  X  T)  V 
ax  a y  y  y 

2(vG)2  =  VT  2(XxXxT+XyXyT)V 

=  vtqv 

Q  =  2  (X  X  T  +  X  X  T)  is  another  matrix  summed  from  powers  of  x  and  y.  The  mean-square  gradient  of  G, 
XX  y  y 

measured  at  all  data  points  { (x^),  i  =  1 . n  }  is  2  (V  (7)2  /  n.  Requiring  this  "average  gradient  magnitude" 

to  be  unity  is  equivalent  to  specifying 

VT  Q  V  =  n.  (14) 

We  wish  to  find  the  vector  V  that  minimizes  the  matrix  product  VT  P  V,  under  the  constraint  that  V1  Q  V 
=  n.  ft  is  well  known  [Courant]  chat  at  the  constrained  minimum  there  exists  some  I  aGrangc  multiplier,  X, 
such  that 

P  V  =  X  Q  V  (15) 

This  equation  would  be  easy  to  solve  by  normal  eigenvalue  mediods  were  it  not  for  the  tact  that  Q  is  a 
singular  matrix,  and  P  is  nearly  singular.  (It  seems  that  the  closer  the  data  points  approximate  a  conic  section, 
the  closer  P  approaches  singularity.)  The  appendix  gives  a  method  for  solving  Equation  15  diat  yields  five 
eigenvalues  {Xj,  i  =  1 . 5}  corresponding  to  five  eigenvectors  {V.,  i=  1 . 5}. 

j 

To  determine  the  aggregate  error.  Equation  13,  we  may  use  Equations  15  and  14  to  produce  the  result 
1  =  VTPV  =  XVtQV  =  Xu. 
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Then  we  know  that  the  coefficients  of  the  quadratic  function  giving  the  minimum  aggregate  error  unucr  the 
given  constraint  .ire  given  by  the  eigenvector  corresponding  to  the  smallest  eigenvalue. 

Solutions  to  the  curve  fitting  problem  are  invariant  with  translation,  rotation,  and  scaling  of  die  input  data 
A  proof  of  this  is  presented  in  Appendix  B. 


Figure  7:  Curve  Fitting  with  Average  Gradient  Constraint 

Figure  7  show  the  same  data  points  that  were  used  for  Figure  6  fit  using  the  "eigenvalue”  method 
described  above.  Comparing  figures  6  and  7.  shows  that  the  new  method  gives  superior  'results. 

Some  Difficulties 

The  problem  of  curve  fitting  gets  worse  when  the  points  to  be  fit  represent  only  part  of  an  ellipse.  Noise 
and  digitization  error  accentuate  the  problem. 

Figures  8  through  10  show  increasingly  difficult  cases.  The  data  points  for  Figure  8  are  a  subset  of  those 
used  to  generate  Figures  6  and  7.  There  is  a  noticeable  flattening  of  the  solution  curve,  but  not  so  much  that 
if  wc  had  no  knowledge  of  how  the  points  were  generated  we  would  say  die  fit  was  "wrong."  flic  misfit  in 
Figure  9  is  more  apparent.  The  same  ideal  ellipse  as  before  was  used  to  generate  the  points,  but  a  "fattening" 
of  the  data  points  has  been  simulated.  Figure  10  represents  an  extreme  case.  The  data  points  were  not 
generated  theoretically,  but  arc  from  an  actual  light-stripe  experiment  [Agin  72). 

What  wc  are  seeing  is  a  systematic  tendency  for  the  solutions  to  flatten,  becoming  elongated  ellipses 
parallel  to  the  general  linear  trend  of  die  data  points.  Hie  tendency  arises  from  the  fact  that,  all  other  dungs 
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Figure  8:  Curve  Fit  to  a  Short  Segment 


Figure  9:  Curve  Fit  to  a  Short,  Fattened  Segment 

being  equal,  the  error  of  a  scatter  of  points  about  a  curve  G(x,y)  =  0  depends  on  die  second  derivative  of  the 
error  function  G.  That  is,  a  function  whose  gradient  varies  rapidly  tends  to  "fit"  better,  in  a  normalized  least- 
squares  sense,  than  a  function  with  a  constant  gradient.  Flattened  ellipses  and  hyperbolas  are  characterized 
by  a  high  second  derivative  of  their  defining  function.  The  curve  fitting  solution  chooses  these  squashed 
curves  over  the  more  intuitive  curves  we  would  prefer. 

The  problem  is  not  limited  to  fitting  with  the  average  gradient  constraint.  Lyle  Smith  (Smith)  noted  die 
same  phenomenon  using  the  general  quadratic  form,  i.e.,  minimizing  Equation  11. 
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Figure  10:  Curve  Fit  to  a  Gently-Curving  Segment 

It  is  tempting  to  try  some  method  that  would  keep  the  general  idea  of  constraining  the  average  gradient,  for 
example  by  computing  that  average  over  the  entire  curve  instead  of  over  all  the  data  points.  This  would 
amount  to  a  constraint  on  the  coefficients  A  through  /-'of  equation  9  independent  of  the  data  points.  A  little 
thought  will  show  that  this  approach  will  not  work  at  all.  The  RMS  error  can  be  made  arbitrarily  small  by 
choosing  a  very  large  and  very  elongated  ellipse  with  a  gradient  magnitude  near  unity  along  most  of  its  length, 
but  a  vanishingly  small  gradient  magnitude  in  the  vicinity  of  die  data  points. 
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Curve  Fitting  by  Hill  Climbing 

The  best  measure  of  the  goodness  of  fit  of  a  point  or  set  of  points  to  a  given  mathematical  curve  G(.\.y)  =  0 
is  provided  by  measuring  the  perpendicular  distance  from  each  point  to  die  curve.  A  reasonable 


approximation  to  that  distance  may  be  had  by  dividing  the  error  function  G(xj- )  by  the  magnitude  of  die 


n 

c 


radient  of  G  measured  at  (x,v ).  With  such  a  definition,  aggregate  error  Z  is  given  by 
G(x.y)  ‘  ^  2 


) 


|vG'Uy)| 

-  y  yT  x  xT  v 

vT  (  x  x 1  +  x  x  T )  v  ’ 

xx  y  y 

where  V,  X,  X  ,  and  X  are  the  same  as  in  the  previous  section. 
*  y 


(16) 


The  point-by-point  division  makes  it  impossible  to  move  the  summation  sign  inside  the  matrix  product  as 
we  did  in  the  previous  section.  Minimizing  Equation  16  will  require  a  hill-climbing  approach.  We  must 
postulate  a  coefficient  vector  V,  use  it  to  evaluate  Z,  then  choose  another  V  to  see  whether  or  not  it  improves 
the  error  I,  etc. 


Even  though  there  arc  six  elements  in  the  vector  V,  there  are  really  only  five  independent  parameters 
necessary  to  specify  an  ellipse.  The  hill  climbing  algorithm  will  manipulate  these  five.  We  are  free  to  specify 
these  parameters  in  any  way  we  choose.  We  only  require  that  it  be  possible  to  derive  V  uniquely  from  these 
parameters.  For  example,  we  could  choose  to  optimize  over  a,  e,  d,  h,  and  k  given  in  Equation  6.  A 
somewhat  better  approach  is  to  represent  the  ellipse  in  the  form 

a  (jc-  h )2  +  P  (x-fiXj’-  *)  +  7  O'—  k)2  =  1  (17) 

and  optimize  over  a,  /?,  y,  h  and  k.  This  formulation  avoids  degeneracy  in  6  (orientation)  when  the  ellipse  is 
nearly  circular. 


Hill-climbing  must  start  with  some  initial  guess  as  to  the  approximating  ellipse.  The  easiest  way  to  do  this 
is  to  choose  three  data  points,  preferably  at  both  ends  and  near  the  middle,  and  calculate  the  circle  that  passes 
through  these  three  points.  Hill-climbing  tends  to  preserve  the  form  of  the  initial  guess.  If  the  initial  guess 
represents  an  ellipse,  the  method  will  not  converge  to  a  hyperbolic  solution.  A  roughly  circular  ellipse  will  not 
be  transformed  to  a  drastically  elongated  one. 

The  minimization  problem  is  rather  ill-conditioned.  Care  must  be  exercised  to  use  the  correct  numerical 
technique,  or  the  results  will  be  poor.  We  have  tried  several  methods.  It  turns  out  that  the  method  of  steepest 
descent  with  accelerated  convergence  is  totally  unacceptable.  It  may  take  many  minutes  of  computer  time  for 
the  method  to  converge,  if  at  all.  Evaluation  of  the  gradient  of  Z  docs  not  appear  to  help  appreciably.  The 
only  method  that  gives  acceptable  results  requires  evaluating  the  matrix  of  second  partial  derivatives  of  Z, 
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then  finding  die  eigenvectors  of  that  matrix.  The  complete  method  is  given  in  Appendix  C. 

We  .lull  not  attempt  to  prove  formally  that  results  obtained  from  hill  climbing  on  die  expression  given  by 
Equation  16  are  independent  of  position,  orientation,  and  scale.  Instead  we  shall  appeal  to  an  mu  iiive 
understanding  of  an  error  function  and  its  gradient.  The  cnr>r  function  should  not  be  affected  by  changes  of 
coordinates,  nor  should  its  gradient.  A  change  of  scale  will  affect  the  error  function  and  its  gradient,  but 
should  multiply  them  by  the  same  constant  value  everywhere.  Hence,  a  local  minimum  will  stay  a  local 
minimum  under  translation,  rotation,  and  scaling.  Depending  on  the  particular  hill-climbing  method  used, 
there  may  be  some  dependence  of  convergence  properties  on  scaling  and  rotation. 


Figure  11:  Hill-Climbing  Curve  Fit 


Figure  12:  Hill-Climbing  Curve  Fit 


Figure  13:  Hill-Climbing  Curve  Fit 


Figures  11  through  13  show  the  data  points  of  Figures  8  through  10  fitted  by  hill  climbing  with  an  initial 
circular  estimate.  Figure  11  is  approximately  equivalent  to  Figure  8.  Figure  12  shows  a  more  noticeable 
improvement  with  respect  to  Figure  9.  While  the  result  doesn't  come  near  the  ellipse  from  which  the  data 
points  were  generated  (cf.  Figure  7),  the  fit  at  the  lower  end  of  the  data  points  is  more  "intuitive."  In  the  case 
of  Figure  13,  the  improvement  is  dramatic. 
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Applying  the  Gradient  Constraint  to  Straight  Lines 

The  i< 'vii,;  >.!■;  !■!-:  ts  n  rcssion  from  the  main  topic  of  fitting  tsecond-order  curves.  A  new 
founulation  of  straight-line  luting  is  obtained  when  we  apply  the  methods  developed  here  10  the  linear  case. 


A  straight  line  is  defined  by  the  equation 
G(x,y)  =  Ax +By+C  —  0. 

We  define 


so  that  we  may  rewrite  Equation  18  as 

G(x,y)=  VTX  =  XTV  =  0. 
We  seek  to  minimize  the  error  function 
2  =  2  f.2  =  2  G2  =  VT  P  V 

where 


P  =  ZXXT  = 


Z*2  Zxy 
Zj ty  Zy2 
Zjc  Z y 


Zx 

n 


(18) 


X  = 

X 

y 

and  V  = 

A 

B 

l 

C 

The  magnitude  of  the  gradient  of  G  is  constant  for  all  x  and  y,  and  is  equal  to  the  square  root  of  A2  +  B2. 

1  0  0 
0  1  0 
0  0  0 


(VC)2  =  VT 


V  =  V1  Q  V 


If  the  gradient  is  constrained  to  unity,  then  the  error  function  G(x,y)  will  be  precisely  equal  to  tire 
perpendicular  distance  from  (x,y)  to  the  line  G  -  0. 


Just  as  in  the  second-order  case,  the  vector  V  that  minimizes  Z  subject  to  the  given  constraint  must  be  a 


solution  to  the  eigenvalue  equation 
P  V  =  X  Q  V. 

Some  algebra  yields  the  pair  of  solutions 


A 

1 

-5 

V  = 

B 

=  —n - r  x 

r  -  X 

C 

Vs “  +  {r— X)2 

(sZx  -  (r— X)Z>)/n 

1 


i  -  X 
-  s 

-(/— X)  Zx  4-  sZy)/n 


V 19) 


V(/-X)2  +  r 


x 


CO) 
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where 

r  =  Zx2  -  (2 x)2/n 
s  =  Zxv  -  Zx  Zy  /  n 
r  =  Zy2  -  {Zy)2/n 

A  =  -L(r+t  -  \J{r-  r)2  +  4r  ). 

The  two  forms  are  mathematically  equivalent  unless  s=0,  in  which  ease  one  form  or  the  other  will  involve  a 
division  by  zero.  For  this  reason,  Fquation  19  is  to  be  preferred  whenever  r  is  greater  than  /,  and  F.quation  20 
when  the  reverse  is  true.  Once  A  and  B  have  been  computed  using  either  form,  C  may  be  easily  computed  as 
— (TZjc  +  BZy)/n.  The  mean-square  error  of  the  fit  is  equal  to  X/n. 


Figure  14:  Straight  Line  Fit  Minimizing  Vertical  Distances 

Figures  14  and  15  show  a  startling  comparison  between  the  traditional  method  of  fitting  straight  lines  and 
the  method  presented  above.  The  data  points  show  a  wide  scatter  about  a  nearly-vertical  line.  The  line  in 
Figure  14  was  fit  using  the  traditional  linear  regression  formulas,  where  a  line  is  represented  by  die  equation 
y  =  M  x  +•  B 

and  M  and  B  arc  calculated  as 

_  /V  Zxy  -  2jc  Zy 

-V  Zx2  -  (  Zx  )2 

_  Zjr2  Zy  -  Zx  Zxy 

N  Zx 2  —  (  Zx  )2 

The  straight  line  of  Figure  15  was  based  on  the  line  representation  of  Equation  18  and  the  solution  of 
Equation  19. 


Figure  15:  Straight  Line  Fit  Minimizing  Perpendicular  Distances 

A  failure  of  a  "tried  and  true"  method  deserves  some  analysis  and  discussion.  In  this  case,  the  failure  is 
traceable  to  the  assumption  that  *  is  the  independent  variable,  that  v  depends  on  x.  But  when  the  trend  of  the 
data  is  nearly  vertical,  it  may  be  that  x  is  more  a  function  of  y.  A  vertical  line  is  degenerate  using  the 
regression  formulas.  If  it  makes  sense  for  a  collection  of  points  on  the  plane  to  approximate  a  vertical  line, 
then  we  should  not  use  linear  regression. 

I  have  not  seen  this  formulation  published  anywhere  else.  1  would  appreciate  anyone  who  has  seen  this 
result  published  elsewhere  letting  me  know. 

Conclusions 

Three  methods  for  fitting  second-order  curves  to  sets  of  data  points  on  the  plane  have  been  presented  and 
analyzed.  These  methods  arc  distinguished  principally  by  the  way  they  measure  the  amount  of  misfit  between 
a  given  curve  and  a  given  set  of  points.  The  three  measures  arc: 

1.  the  quadratic  form,  with  the  constant  term  set  equal  to  1  (Equation  11), 

2.  the  quadratic  form  (Equation  13)  subject  to  the  average  gradient  value  being  held  to  1  (Equation 
14), 

3.  the  quadratic  form  divided  by  the  gradient  magnitude  at  each  point  (Equation  16). 

As  may  be  expected,  the  three  measures  lead  to  different  results  when  minimized.  The  first  measure  has 
been  shown  to  be  sensitive  to  translation  in  the  plane,  and  to  give  grossly  incorrect  results  under  certain 
conditions.  The  second  measure  has  been  formally  shown  to  be  insensitive  to  translation,  rotation,  and 
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scaling,  and  reasons  have  been  given  why  the  third  measure  ought  to  be  the  same.  The  third  measure  has 
been  shown  to  give  somewhat  better  results  than  the  second,  particularly  in  difficult  eases  with  small  angular 
arcs  and  widely  scattered  data  points. 

The  three  measures  also  lead  to  very  different  computational  procedures  for  their  minimization. 
Minimizing  measures  1  and  2  both  require  summing  products  of  x  and  y  up  to  the  4th  power;  in  this 
summation  they  areO (n),  where  n  is  the  number  of  data  points.  But  for  fewer  than  100  data  points,  the  major 
use  of  computation  time  is  in  solution  of  the  simultaneous  linear  equations  (for  measure  1),  or  the  eigenvalue 
solution  (for  measure  2).  On  a  Digital  Equipment  Corporation  2060  computer,  generation  of  Figures  such  as 
6  and  7  typically  require  about  50  milliseconds. 

On  the  other  hand,  measure  3  is  very  expensive  computationally.  Computation  time  is  a  direct  function  not 
only  of  the  number  of  data  points,  but  also  of  the  initial  solution  estimate  and  the  accuracy  required. 
Generation  of  Figures  11  and  12  required  24  and  42  seconds  respectively.  Hence  hill  climbing  is  to  be 
recommended  only  when  all  other  methods  prove  inadequate. 
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Appendix  A:  Solution  of  the  Generalized  Eigenvalue  Equation 

\V\  v  i  ;o  solve  the  gcnerali/cd  eigenvalue  equation 
PV  AQV. 

given  that  Q  is  singular  and  P  may  be  close  to  singular.  The  following  method  was  derived  by  Richard 
Underwood. 
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r 


UT  Z  +  a  W  =  0. 

This  may  bo  solsed  co  give 

I  tT  y 

W=-  — .  (22) 

a 

The  top  five  rows  of  Equation  21  represent  the  vector  equation 
C  Z  +  U  IV  =  A  L 

into  which  we  may  substitute  our  result  for  IT.  Equation  22,  to  yield 

(C- IuUT)Z  =  XZ.  (23) 

a 

Equation  23  may  be  solved  by  usual  eigenvalue  methods,  such  as  the  Q-R  algorithm  [Isaacson).  Given  a 
particular  solution  Z.,  the  corresponding  \f  is  given  by 
Z 

V  =  lt  -- 


where  W  is  given  by  Equation  22. 

Appendix  B:  Rotational,  Translational,  and  Scaling  Invariance 

The  rotational,  translational,  and  scaling  invariance  of  the  method  may  be  shown  as  follows:  Let  there  be  a 

coordinate  system  ( u.  v)  related  to  (x.y)  by  the  transformations 

u  =  ax+by+c  (24) 

v  =  Tr  +  fy+/. 

We  may  define  a  vector  U  analogous  to  X  such  that 


=  HX  = 


ul  <r  lab  bl  lac  Ibc  <r 

tiv  ad  ae+be  be  af+cd  bf+ce  cf  xy 

U  =  v2  =  HX  =  d2  Ide  e2  Idf  lef  }  x  /  . 

u  0  0  0  a  i>  c  x 

v  0  0  0  d  e  f  y 

.  1  0  0  0  0  0  lJ  1  . 

Just  as  a  vector  V  defines  an  error  function  Gx.y )  =  VT  X.  a  vector  V  can  define  an  error  function  G  ’  =  V  T 
U.  Equating  the  two  error  functions  (for  all  X)  yields  the  relationship 
V  =  HtV’. 

Now,  suppose  that  for  some  collection  of  data  points  {Xj,  i=  l....n}  V  minimizes  the  aggregate  error  given 
by  Equation  13.  We  have  shown  that  V  satisfies  Equation  15.  and  that  the  corresponding  eigenvalue  A  is  the 
smallest  of  all  eigenvalues  that  satisfy  the  equation. 


Rewriting  Equation  15  and  performing  some  algebra  gives 
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P  V  =  A  Q  V 

S  (X  Xr)  V  =  X  2  (XX  XXT  +  XV  Xyr)  V 
2  (X  X 1 HT  V  ')  =  A  2  (X  X  1  IiT  V  +X  Xth  'V) 

x  x  y  y 

2 (H  X  XT  HrV  )  =  X  2(H  X  X  r  HT  V  +  II  X  X  T1IT  V ') 

xx  y  y 

2  (U  UT  V  ')  =  X  2  (Ux  UxT  V  ’+  Uy  UyT  V  ') 

where  Ux  and  Uy  denote  the  partial  derivatives  of  U  with  respect  to  x  and  y,  respectively.  If  the 
transformation  of  Equation  24  is  an  orthonormal  transformation,  that  is,  if 
a2  +  A2  =  c2  +  J1 
and 

ad  +  be  —  0 , 
then  it  may  be  shown  that 

U  1TtU  U  t  =  (a1  +  b2)  (U  U  T  +  U  U  T) . 
xx  yy  uu  vv7 

Substituting  this  result  in  die  above, 

2  (  U  Lt  )  V  '=  (<r  +  b 2)  \  2  (U,  U  T  +  U  U  T)  V  ’ 

yields  the  result  we  seek:  any  solution  to  Equation  15  in  one  coordinate  system  is  also  a  solution  in  any 
orthonormally  related  coordinate  system.  Since  the  eigenvalues  arc  proportional,  the  smallest  eigenvalues  in 
the  two  coordinate  systems  correspond. 

Appendix  C:  Hill  Climbing  Method 

Hill  climbing  refers  to  a  class  of  numerical  methods  that  minimize  a  function  (7(U).  where  U  may  be  a  n- 
dimcnsional  vector.  For  our  purposes,  we  may  assume  the  existence  of  a  subroutine  mini  that  minimizes  G 
along  a  straight  line.  It  accepts  an  initial  estimate  UQ  and  an  increment  AU.  finds  a  value  of  k  that  locally 
minimizes  t7(E'0  +  k  AU),  and  updates  U()  to  tire  new  minimizing  value.  Different  hill-climbing  strategics 
consist  of  different  means  of  selecting  a  sequence  of  AU  vectors.  The  sequence  terminates  w  hen  no  further 
improvement  in  (Jean  be  obtained. 

A  method  that  requires  no  knowledge  about  the  function  G  is  to  search  sequentially  along  the  n  dimensions 
of  U.  i.e..  to  apply  the  sequence 


’  1 

’  0 

’  0 

'  1 

AU  = 

0 

’ 

l 

. 

0 

■ 

0 

.  0  . 

.  0  . 

.  I  . 

.  0 

This  is  one  form  of  the  method  of  steepest  descent.  For  some  functions  this  method  will  suffice.  But  if  the 
function  G  is  ill-conditioned,  that  is  if  the  elements  of  U  interact  to  a  great  degree  in  their  influence  on  <?(U). 
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or  if  the  cquipotcntials  of  G  tend  to  form  squashed  ellipsoids,  then  tins  simple  approach  will  converge  very 
slow’-.  Figure  16  shows  a  hypothetical  sequence  of  iterations  in  minimizing  a  function  of  two  variables. 


Figure  16:  Iterations  in  Method  of  Steepest  Descent 

Convergence  can  be  enhanced  by  keeping  track  of  the  cumulative  change  in  U  as  the  minimizadon 
proceeds.  After  n  minimizations  along  the  n  coordinate  directions  of  U.  an  additional  minimization  step  can 
be  attempted  along  the  direction  indicated  by  the  sum  of  the  individual  All  terms  measured  in  the 
preceding  n  calls  to  mini.  This  is  called  the  method  of  steepest  descent  with  accelerated  convergence. 

Some  improvement  in  performance  can  be  obtained  if  it  is  possible  to  evaluate  the  gradient  of  G.  that  is, 
the  n  partial  derivatives  of  G  with  respect  to  the  elements  of  U.  At  each  minimization  step  let  All  point  in  the 
direction  of  the  gradient.  Use  of  the  gradient  can  give  a  computational  advantage  in  reducing  the  number  of 
calls  to  mini,  but  it  is  doubtful  whether  this  technique  affects  overall  convergence  properties. 

The  situation  illustrated  in  Figure  16  can  be  completely  avoided  if  the  second  partials  of  Garc  available.  In 
the  neighborhood  of  U'0,  G(U)  may  be  approximated  by  the  expression 

GAU)  =  G0  +  Dr (U  -  Uq)  +  (U  -  u/  P (U  -  (25) 

where  D  =  D  (Uq)  is  the  gradient  vector,  or  vector  of  first  partial  derivatives,  and  P  =  P  (L’0)  is  the  matrix  of 
second  partial  derivatives.  P  is  a  symmetric  matrix.  An  eigenvalue  analysis  of  P  will  give  n  linearly 

independent  eigenvectors  {U,.  i=  1 . /i}  and  associated  eigenvalues  f=  1 . n}  such  that 

PU  =  A  U. 

i  i  i 

These  eigenvectors  point  in  the  directions  of  the  principal  axes  of  the  equipotential  ellipsoids  of  G.  Function 
minimization  may  take  place  in  the  eigenvector  directions  independently  without  cross-coupling  or  co- 
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variance  effects.  Convergence  will  be  quite  rapid. 


If  die  eigenvectors  arc  normalized  to  unit  magnitude,  and  we  let  U  =  UQ  +  k  11,  then  Kquauon  Ij 
becomes 

<7(U)  s  G0  +  k  Dt  U.  +  \  k2  .  (26) 

Taking  die  derivative  with  respect  to  k  and  setting  the  result  equal  to  zero,  we  find  that  the  minimum  ought  to 

T 

occur  when  k  -  D  If  /  (  2  k  Xj ).  If  \  is  negative,  as  it  frequendy  turns  out  to  be,  then  the  k  above  actually 
points  to  a  relative  maximum.  This  result  can  be  used  to  guide  die  minimization  by  subroutine  mini,  to 
suggest  initial  step  size  for  die  search,  but  experience  show  s  that  the  use  of  MINI  should  not  be  bypassed. 


For  die  ease  at  hand,  ellipses  are  represented  by 
a  (.x-  h )2  +  p  (x-  h)  (y-  k)  +  y  (y-  k)2  =  1 , 

or 

a  x2  +  P  xy  +  y  y2  -  ( 2ah  +  pk )  x  -  (Ph  +  2yk)y  4-  a/i2  +  phk  +  yk2  -1=0. 
Therefore  let 


a 

a 

p 

p 

y 

and  V(U)  = 

y 

h 

-2ah-pk 

.  k  j 

-Ph-2yk 

l.  ah~+phk+ylr-l  . 

and  let  X,  X^,  and  Xv  be  defined  as  before.  fr(U)  is  given  by 
7  (V 

6(U)=-  =  SF2  =  2- 
D 

where 

N  =  VTX  XT  V  =  (  XTV)2 

D  =  \r(\  Xj  +  Xv  X/  )  V  =  (  X  J  V  )2  +  (XvT  V  )2 . 

\  \  y  y  h  y 


The  first  and  second  partial  derivatives  of  .V  and  D  widi  respect  to  the  elements  of  U  arc: 


VN  =  2 


Fx2 
Fxv 
iy 2 

-  FF„ 


I 


J 
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72N  =  2 

r  *4 

x’2/2 

-  F  x’2  — 2Fx’ 

.  F/2 

-  f\xy  -  Fy 

—  F  xy  -  Fx' 

y 

x’2/2 

xy2 

-F/^-2Fy 

-Fx'2-2Fx' 

-F/y-/y 

-  F y2 

F  2  4- 2  Fa 
!\Fy  +  F0 

- Fx 2  1 

V 

-Fvx'/-FFx' 
-F/-2Fy 
FT  +  F/? 

\  ->y  ^ 

F-  +  2Fy 

7D  =  2 

2Fxx 

Fxy  +  F  x' 
IF/  . 
-2  Fx«-FJi 
l-Fxp-2fy  J 

v2D  =  2 

’  4x'2 

2.x'/ 

0 

-4jca-2F 

-2x'/3 

4.x’/ 

/2  +  x’2 

2x*/ 

-2/  a-x'fi  -2F 
-y'fi-Fx-2xy 

0 

2x’/ 

4/ 2 
-2 y'0 
-4/y-2Fy 

-4x’a-2F 

-2ya~x'/3-F 

-2yP 

4a2  +  ip2 
2a/?  +  2/Fy 

-2x'y3 

-y'P-Fx- 2xy 
-4/y-2F 

2af$  +2fiy 
2p2+4y2 

where 

» _  _ 

x'=  x  —  h 
y=  y  -k 

F=  a  xl  +  (i  xy'  +  y  y-  1 
F  =  2  a  x’+  P  / 

Fy=/?x’+  2y/. 

The  first  and  second  partials  of  £  can  be  derived  from  the  partials  of  N  and  D  by  use  of  the  formulas 
9  N  DN  -  NDn 

- =  _ 0 - B 

dp  D  D2 

d2  N  2ND  D  ~  Nn  D  D  —  N  D  D  +  /V  D2  +  N  D  D 
_ =  _ a — o - s _ a - a - a - bo _ bb 

dpdq  D  if 

where  p  and  q  stand  for  any  of  the  set  {a,  j3,  y,  h,  and  k},  and  subscripting  denotes  taking  the  partial 
derivative.  The  derivative  of  a  sum  is  equal  to  the  sum  of  the  derivatives  of  the  terms.  Hence,  the  partial 
derivatives  of  Z  are  the  sums  of  the  partial  derivatives  of  the  individual  £.  terms. 
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